Initial support for listing voices and languages #7

panaC · 2024-08-13T16:25:53Z

No description provided.


HadrienGardeur · 2024-08-13T16:37:36Z

You can convert this to a draft PR and move it back to a PR once it's ready rather than putting [WIP] as a prefix.

chocolatkey · 2024-08-14T23:59:40Z

src/voices.ts

+
+    return new Promise((resolve, _reject) => {
+
+        let counter = 1000;


Can you explain the reason that you need to check the voices every 10ms 1000 times? I imagine it's a hack to deal with browsers, but why those values?

panaC · 2024-08-15

1000 iterations is only the worst case, which ends by returning an empty array.
In Firefox and Safari, the Speech API works from the start when called from a script in the header, whether deferred or not.
In Chrome, the Speech API becomes available in a deferred script just before the 'DOMContentLoaded' event fires.
In Edge, the slowest browser, the Speech API is available after about 60ms on my computer.

So a counter somewhere between 10 and 100 would be better than 1000, which could take too long.

Can you think of an alternative way to catch the result of getVoices()?
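One possible alternative to fixed-interval polling, sketched here rather than taken from the PR: listen for the `voiceschanged` event that the Web Speech API fires when the voice list changes, and keep a timeout as a fallback for browsers that never fire it. The `VoiceSource` interface, the `waitForVoices` name, and the 2000 ms default are assumptions made for illustration; in a browser, the object passed in would presumably be `window.speechSynthesis`.

```typescript
// Minimal shape of the parts of SpeechSynthesis this sketch relies on
// (hypothetical interface, so the function can be exercised without a browser).
interface VoiceSource<V> {
    getVoices(): V[];
    addEventListener(type: "voiceschanged", listener: () => void): void;
    removeEventListener(type: "voiceschanged", listener: () => void): void;
}

// Resolves with the voices as soon as they are known, or with whatever
// getVoices() returns (possibly an empty array) once timeoutMs has elapsed.
function waitForVoices<V>(synth: VoiceSource<V>, timeoutMs = 2000): Promise<V[]> {
    return new Promise((resolve) => {
        const initial = synth.getVoices();
        if (initial.length > 0) {
            resolve(initial);
            return;
        }

        // Called by either the event or the fallback timer, whichever comes first.
        const finish = () => {
            clearTimeout(timer);
            synth.removeEventListener("voiceschanged", finish);
            resolve(synth.getVoices());
        };

        const timer = setTimeout(finish, timeoutMs);
        synth.addEventListener("voiceschanged", finish);
    });
}
```

Compared with a counter, this trades the choice of a tick interval for the choice of a single upper bound on the wait; whether every targeted browser fires the event reliably would still need to be checked.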

script/extract-json.mjs

chocolatkey · 2024-08-15T00:12:20Z

src/data.ts

@@ -0,0 +1,26 @@
+
+


Something to think about in a future version is to come up with a way of compressing this data. Adding all these strings to a JS bundle is a significant size increase (est. ~60KB based on minified data.js). This could be done with, for example, the CompressionStream API and/or a more efficient encoding. Let me know your thoughts

danielweck · 2024-09-04

In Thorium / navigator, we sometimes pass large amounts of textual data (serialized JSON) via URL query parameters to guarantee that the webview / iframe resolves the data "synchronously" (instant access). We use code that looks like this (it uses some of the NodeJS / Electron Buffer API, but nothing that couldn't be pure Web API):

const json = { /* LOTS OF DATA */ };
const jsonStr = JSON.stringify(json);
const cs = new CompressionStream("gzip");
const csWriter = cs.writable.getWriter();
csWriter.write(new TextEncoder().encode(jsonStr));
csWriter.close();
const buff = Buffer.from(await new Response(cs.readable).arrayBuffer());
const b64 = buff.toString("base64");
// pass b64 as escaped URL query parameter

The inverse operation:

const buff = Buffer.from(b64, "base64");
const cs = new DecompressionStream("gzip");
const csWriter = cs.writable.getWriter();
csWriter.write(buff);
csWriter.close();
const buffer = Buffer.from(await new Response(cs.readable).arrayBuffer());
const jsonStr = new TextDecoder().decode(buffer);
const json = JSON.parse(jsonStr);
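As a rough illustration of the "pure Web API" variant mentioned above, the same round trip can be sketched without `Buffer`, using `btoa`/`atob` for the base64 step. All function names here (`bytesToBase64`, `base64ToBytes`, `pipeBytes`, `jsonToGzipBase64`, `gzipBase64ToJson`) are hypothetical, and this is not the code used in Thorium.

```typescript
// Base64 helpers built on btoa/atob instead of the Node.js Buffer API.
function bytesToBase64(bytes: Uint8Array): string {
    let binary = "";
    bytes.forEach((b) => { binary += String.fromCharCode(b); });
    return btoa(binary);
}

function base64ToBytes(b64: string) {
    return Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
}

// Pushes bytes through a (De)CompressionStream and collects the whole output.
async function pipeBytes(
    stream: CompressionStream | DecompressionStream,
    input: BufferSource,
): Promise<ArrayBuffer> {
    const writer = stream.writable.getWriter();
    // Deliberately not awaited: the readable side has to be drained concurrently,
    // and a stream error also surfaces through the Response below.
    writer.write(input).catch(() => undefined);
    writer.close().catch(() => undefined);
    return new Response(stream.readable).arrayBuffer();
}

async function jsonToGzipBase64(value: unknown): Promise<string> {
    const bytes = new TextEncoder().encode(JSON.stringify(value));
    const compressed = await pipeBytes(new CompressionStream("gzip"), bytes);
    return bytesToBase64(new Uint8Array(compressed));
}

async function gzipBase64ToJson(b64: string): Promise<unknown> {
    const raw = await pipeBytes(new DecompressionStream("gzip"), base64ToBytes(b64));
    return JSON.parse(new TextDecoder().decode(raw));
}
```

Standard base64 output contains `+`, `/` and `=`, so it would still need `encodeURIComponent` before being placed in a URL query parameter.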

HadrienGardeur · 2024-08-15T11:14:54Z

Based on this latest commit, I've identified a bug with Finnish and Filipino.

In Edge, while testing the demo I noticed that the Filipino voices are listed under Finnish > Philippines and that Filipino is not listed in the list of languages.
Finnish voices use fi-FI while Filipino voices use fil-PH. It seems that three-letter language codes are not handled properly and can end up grouped with two-letter codes that start with the same letters.
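To make the failure mode concrete, here is a small sketch contrasting prefix matching with a strict comparison of the primary language subtag. The function names are hypothetical and this is not the library's code; splitting on underscores as well as hyphens is an assumption, made in case a platform reports tags such as `en_US`.

```typescript
// Primary language subtag of a BCP 47 tag: "fil-PH" -> "fil", "fi-FI" -> "fi".
function primarySubtag(tag: string): string {
    const [primary = ""] = tag.split(/[-_]/);
    return primary.toLowerCase();
}

// Prefix matching: wrongly treats Filipino ("fil-PH") as Finnish ("fi").
function matchesByPrefix(voiceLang: string, language: string): boolean {
    return voiceLang.toLowerCase().startsWith(language.toLowerCase());
}

// Strict matching on the primary subtag keeps "fi" and "fil" apart.
function matchesByPrimarySubtag(voiceLang: string, language: string): boolean {
    return primarySubtag(voiceLang) === primarySubtag(language);
}

console.log(matchesByPrefix("fil-PH", "fi"));        // true (the bug)
console.log(matchesByPrimarySubtag("fil-PH", "fi")); // false
console.log(matchesByPrimarySubtag("fi-FI", "fi"));  // true
```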

HadrienGardeur · 2024-08-15T13:39:44Z

I've also inspected the output in Edge and noticed that the Microsoft Natural Voices have an incorrect value for pitchControl;

This value seems to be skipped by the code importing the JSON data.

build/src/voices.js

panaC · 2024-08-16T08:05:26Z

I've also inspected the output in Edge and noticed that the Microsoft Natural Voices have an incorrect value for pitchControl;

This value seems to be skipped by the code importing the JSON data.

I don't understand where you set the pitchControl value for the voice. Even in demo mode, there is no pitchControl setting in SpeechSynthesisUtterance. Did I miss something?


HadrienGardeur · 2024-08-20T07:59:05Z

On Android I've noticed that I cannot select some voices available in the demo. All languages supported officially work fine, but if I select "Chinese" for example, the list of voices is never loaded.

Here's a list of languages that seem affected:

yue
su
Serbian
sd
sat
Pendjabi
mni
ks
Hindi
bs

As you can notice, quite a lot of languages also seem to have missing translations in Intl.DisplayNames, although their voice names are properly translated by Google.

Instead of "hello world", I think that we should just default to an empty string when there's no test utterance.
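For the missing translations, one defensive approach is to fall back to the raw language code whenever `Intl.DisplayNames` has no name or rejects the code. The sketch below is only an illustration under that assumption; `languageLabel` is a hypothetical helper, not a function from this library.

```typescript
// Returns a localized language name, or the code itself when no name is
// available or when Intl.DisplayNames throws (for example on a malformed code).
function languageLabel(code: string, uiLocale = "en"): string {
    try {
        const names = new Intl.DisplayNames([uiLocale], { type: "language" });
        return names.of(code) ?? code;
    } catch {
        return code;
    }
}

console.log(languageLabel("fr"));         // localized name for French
console.log(languageLabel("not a code")); // falls back to the input string
```

Showing the code is not pretty, but it keeps the language selectable instead of leaving an empty or broken entry in the list.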


panaC · 2024-08-23T16:44:27Z

Fixes #6

Sort by Language/region
Sort by gender
Sort by name
Sort by quality

Fixes #5

group by language
group by regions
group by kind of voices

Fixes #4

filter on offline availability
filter on gender
filter on quality
filter on novelty
filter on veryLowQuality
filter on recommended

Fixes #2
Fixes #1

function getVoices(): Promise<IVoices[]>

export interface IVoices {
    label: string;
    voiceURI: string;
    name: string;
    language: string;
    gender?: TGender | undefined;
    age?: string | undefined;
    offlineAvailability: boolean;
    quality?: TQuality | undefined;
    pitchControl: boolean;
    recommendedPitch?: number | undefined;
    recommendedRate?: number | undefined;
}
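As a rough illustration of how the sort, group and filter helpers listed above could operate on this shape, here is a sketch over a reduced stand-in type. `VoiceInfo`, the `Gender` values, and every function name below are assumptions made for the example; they are not the library's implementation, and `TGender` itself is not shown in this PR.

```typescript
// Reduced, hypothetical stand-in for IVoices with only the fields used here.
type Gender = "female" | "male" | "nonbinary"; // assumed values
interface VoiceInfo {
    name: string;
    language: string; // BCP 47 tag, e.g. "fr-CA"
    gender?: Gender;
    offlineAvailability: boolean;
}

// Filter on gender: voices without a known gender are dropped.
function keepGender(voices: VoiceInfo[], gender: Gender): VoiceInfo[] {
    return voices.filter((v) => v.gender === gender);
}

// Filter on offline availability.
function keepOffline(voices: VoiceInfo[]): VoiceInfo[] {
    return voices.filter((v) => v.offlineAvailability);
}

// Group by primary language subtag ("fr-CA" and "fr-FR" both land in "fr").
function groupByLanguage(voices: VoiceInfo[]): Map<string, VoiceInfo[]> {
    const groups = new Map<string, VoiceInfo[]>();
    for (const voice of voices) {
        const [primary = ""] = voice.language.split("-");
        const key = primary.toLowerCase();
        const bucket = groups.get(key) ?? [];
        bucket.push(voice);
        groups.set(key, bucket);
    }
    return groups;
}

// Sort by name without mutating the input array.
function sortByName(voices: VoiceInfo[]): VoiceInfo[] {
    return [...voices].sort((a, b) => a.name.localeCompare(b.name));
}
```

Each helper returns a new collection, so calls can be chained, for example `sortByName(keepOffline(voices))`.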


danielweck · 2024-09-04T16:19:10Z

script/extract-json.mjs

+export const defaultRegion = ${JSON.stringify(defaultRegion)};
+`;
+
+const filePath = './src/data.ts';


This source file is script-generated. May I suggest:

adding a comment header at the top of the file that explains which script produced the TypeScript code, and at what date/time (or ideally: git revision / commit hash);

locating the generated file inside a "gen" subfolder of the "src" source tree, or renaming the file, e.g. src/data.gen.ts.
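A minimal sketch of the first suggestion, assuming the generator script simply prepends a string to the file it writes. `generatedFileHeader` and the exact wording of the header are hypothetical.

```typescript
// Builds the comment header for a script-generated source file.
function generatedFileHeader(scriptPath: string, revision: string, generatedAt: Date): string {
    return [
        "// GENERATED FILE, DO NOT EDIT BY HAND.",
        `// Produced by: ${scriptPath}`,
        `// Git revision: ${revision}`,
        `// Generated at: ${generatedAt.toISOString()}`,
        "",
    ].join("\n");
}

// Example with placeholder values; a real script would pass the actual revision.
const exampleHeader = generatedFileHeader("script/extract-json.mjs", "<commit hash>", new Date(0));
console.log(exampleHeader);
```

In a Node.js script, the revision could be obtained by running `git rev-parse HEAD` through `child_process.execSync`, and the header would be concatenated in front of the generated TypeScript before writing the file.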

danielweck · 2024-09-04T17:11:21Z

src/voices.ts

+            if (Array.isArray(voices) && voices.length) return resolve(voices);
+            setTimeout(tick, 10);
+        }
+        setTimeout(tick, 10);


setTimeout() has a resolution of about 20ms, last time I checked. Also, IIRC this is affected by browser window visibility (it doesn't really matter for this use-case though)

... I think that a 200ms tick would be fine for this.
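For comparison, a bounded polling loop with a 200 ms tick might look like the sketch below. `pollForItems` and its defaults are assumptions for illustration, not the code under review; with these defaults the worst-case wait is nominally (maxAttempts - 1) × intervalMs = 1.8 s, versus roughly 10 s for 1000 ticks of 10 ms.

```typescript
// Polls read() every intervalMs until it returns a non-empty array,
// giving up (and resolving with an empty array) after maxAttempts calls.
function pollForItems<T>(read: () => T[], intervalMs = 200, maxAttempts = 10): Promise<T[]> {
    return new Promise((resolve) => {
        let attempts = 0;
        const tick = () => {
            attempts += 1;
            const items = read();
            if (Array.isArray(items) && items.length > 0) {
                resolve(items);
            } else if (attempts >= maxAttempts) {
                resolve([]);
            } else {
                setTimeout(tick, intervalMs);
            }
        };
        // First check happens immediately, so already-available data costs no delay.
        tick();
    });
}
```

In a browser this would presumably be called as `pollForItems(() => window.speechSynthesis.getVoices())`.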

JayPanoz · 2024-09-24T10:57:27Z

README.md

+function filterOnGender(voices: IVoices[], gender: TGender): IVoices[]
+
+function filterOnGender(voices: IVoices[], gender: TGender): IVoices[]


Remove duplicate

Suggested change

function filterOnGender(voices: IVoices[], gender: TGender): IVoices[]

panaC added 14 commits August 6, 2024 18:27

setup typescript, import and parse JSON file to generate a data js file

f76d089

working on getVoices() with filter and sorter

ef18a9d

add tests on sortByLanguage and filterOnRecommended

ecb0418

Fixes some issue on filterOnRecommendec lowQuality output

groupBy fonction + test

2780645

github ci

b7cb0fa

update demo

0f9c4c3

deploy to github pages github/actions

8a5c0f6

gh-pages : try root folder to include demo

0171924

typo root folder gh-pages

864c467

no need to have a gh-pages action statically build at each push commit

1eeb4ed

fix: preferredLanguage and disable the language sorting function

94a05b0

fix: typo on broken test and lowerQuality typo

e36bbe7

fixes some issue and add localization to groupBy

5850a28

add localization to the demo

50c15ae

panaC self-assigned this Aug 13, 2024

panaC added 2 commits August 13, 2024 18:51

fix: demo utterance URL

79b6872

fix: label demo

0d9c6c2

panaC marked this pull request as draft August 13, 2024 17:29

HadrienGardeur requested a review from chocolatkey August 14, 2024 08:39

HadrienGardeur changed the title ~~[WIP] Readium Speech JS library development 🙈~~ Initial support for listing voices and languages Aug 14, 2024

chocolatkey reviewed Aug 15, 2024


fix filterOnRecommended: push to lowerQuality the altNames voices found

b88c1f2

HadrienGardeur reviewed Aug 15, 2024



panaC added 3 commits August 16, 2024 10:47

fix: bcp47 language matching between 'fi' and 'fil' with strict compa…

68449c8

…raison instead of startsWith

chore: comment on demo file

05b2831

chore: switch the data json src to the main branch

6fa11ad

HadrienGardeur added the voice-selection label Aug 20, 2024

panaC added 5 commits August 23, 2024 16:59

fix readium#8: improve demo with offline availability and gender filter

a3e58a8

fix filterOnRecommended: change name comparaison from startsWith to s…

4f8c108

…trict equality when no quality and altNames are available (default comparaison)

fix demo: up

15f15bb

fix demo: add push to logs server button

405ad07

fix: handle Intl.displayName?.of exception

d5e4a20

panaC added 3 commits August 23, 2024 18:48

fix demo: quick fix on voicesGroupedByRegions maybe undefined

d329173

fix demo: up previous commit

9d585ac

fix: demo gender update

597d389

jacobwsmith777 approved these changes Aug 26, 2024


panaC added 12 commits August 29, 2024 14:06

new demo

30405ba

update README

05eab69

update dataset

019328c

update dataset

1d26554

fix demo: text input

fae7627

Fixes: sortByLanguage and sortByRegion

b8d5fc2

remanage : sortBy function to handle localization by default remove the promise return of getLanguages, only getVoices can fetch SpeechSynthesisVoices and parse it.

fix: window.navigator.language?s potentially undefined in node

5c6d54c

chore: update dataset

88aa770

chore: update dataset with new json voice and build cjs,mjs

ac6127b

fix: demo textToRead formated

3512190

chore: update dataset grec

eef198d

fix: demo

c9fcfad

panaC mentioned this pull request Aug 30, 2024

Speech TTS voice selector edrlab/thorium-reader#2513

Draft

fix demo

55e98e6

danielweck requested changes Sep 4, 2024


HadrienGardeur requested a review from JayPanoz September 24, 2024 08:34

JayPanoz approved these changes Sep 24, 2024

